1.  INTRODUCTION
where $X_j$ are i.i.d. standard normal variables (i.e., with mean 0 and variance
1), and where $c_j$ and $a_j$ are non-negative constants, arise in many
problems  in  statistics.  A brief  survey  of  such  problems,  with  references, 

is  in  Jensen  and  Solomon  (1972).  We  employ  their  notation  in  this  report. 
An  important  class  of  problems,  where  there  has  been  much  recent  activity, 

concerns  asymptotic  theory  of  goodness  of  fit  tests.  Statistics  related 
to  Pearson's  chi-square,  when  unknown  parameters  are  estimated  by  various 
methods, and when cells are allowed to be data-dependent, have asymptotic
distributions of this type; for a survey see Moore (1976). Statistics

based  on  the  empirical  distribution  function  (EDF  statistics)  also  have 
asymptotic  distributions  of  this  type;  references  are  given  below. 

Exact significance points of $Q_k$, for selected $c_j$ and all
$a_j = 0$, have been published for $k = 2$ and 3 by Grad and Solomon (1955)
and by Solomon (1960) (reproduced in Owen (1962)), and for $k = 4$ and 5 by
Johnson and Kotz (1968); Solomon also gives tables of $P_k(t)$

for  certain  values  of  t.  The  mathematical  difficulties  involved  in 
obtaining  exact  values  increase  with  k,  and  several  approximate  methods 
have  been  suggested.  The  most  accurate,  as  we  show  later,  appears  to  be 
that of Imhof (1961), in which the characteristic function of the
distribution is inverted numerically to give $P_k(t)$ for given $t$. In the
calculations, an integral with an infinite upper limit must be calculated
for each $t$. Imhof gives a bound on the accuracy obtained when this upper


limit  is  replaced  by  a finite  T.  This  is  apart  from  inaccuracies  in  the 
method  of  numerical  integration.  Thus  a measure  of  accuracy  can  be  attained. 
The Imhof method has been used by Durbin and Knott (1972), Durbin,
Knott and Taylor (1975) and Pettitt and Stephens (1975) in approximating
distributions  of  EDF  statistics.  These  require  k to  be  infinite  and 
make  use  of  a modification  of  Imhof's  technique  suggested  by  Durbin  and 
Knott. 
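
The inversion described above can be sketched in a few lines. The block below is a minimal illustration, not the routine used for the tables in this report. It assumes the form $Q_k = \sum_j c_j (X_j + a_j)^2$ (the notation of Jensen and Solomon), uses Imhof's integral for the upper-tail probability, and replaces the infinite upper limit by a fixed finite $T$ with a composite Simpson rule. The function name `imhof_upper_tail` and the defaults `T=200` and `n=40000` are illustrative choices; in practice $T$ would be chosen from Imhof's bound.

```python
import math

def imhof_upper_tail(t, c, a=None, T=200.0, n=40000):
    """Approximate Pr(Q > t) for Q = sum_j c_j * (X_j + a_j)**2, X_j iid N(0, 1).

    Uses Imhof's formula
        Pr(Q > t) = 1/2 + (1/pi) * integral_0^inf sin(theta(u)) / (u * rho(u)) du,
    with the integral truncated at T (an assumed, fixed cutoff).
    """
    if a is None:
        a = [0.0] * len(c)

    def integrand(u):
        if u == 0.0:
            # Limit of sin(theta(u)) / (u * rho(u)) as u -> 0.
            return 0.5 * sum(cj * (1.0 + aj * aj) for cj, aj in zip(c, a)) - 0.5 * t
        theta = -0.5 * t * u
        log_rho = 0.0
        for cj, aj in zip(c, a):
            s = cj * u
            d = 1.0 + s * s
            theta += 0.5 * (math.atan(s) + aj * aj * s / d)
            log_rho += 0.25 * math.log(d) + 0.5 * (aj * s) ** 2 / d
        return math.sin(theta) / (u * math.exp(log_rho))

    # Composite Simpson rule on [0, T]; n must be even.
    h = T / n
    total = integrand(0.0) + integrand(T)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 0.5 + (h / 3.0) * total / math.pi

# Check against a case with a closed form: four unit weights give chi-square(4),
# for which Pr(Q > t) = exp(-t/2) * (1 + t/2).
approx = imhof_upper_tail(5.0, [1.0, 1.0, 1.0, 1.0])
exact = math.exp(-2.5) * 3.5
```

The integrand decays only like a power of $u$, so for small $k$ a larger $T$ (or a bound-driven choice) would be needed to reach the same accuracy.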


2.  NEW SIGNIFICANCE POINTS FOR $Q_k$

It is a convenience to have exact points at hand whenever possible,
so for selected $c_j$ and all $a_j = 0$, we give significance points of
$Q_k$, for $k = 6, 8, 10$, in Table 1. These have been calculated by
Imhof's method. They will also provide useful anchor points for comparison
with other approximations already in the literature or to be suggested in
Section 3. In Table 2, among several comparisons, we give exact values of
significance  points,  and  values  calculated  by  Imhof's  method  to  a high 
order  of  accuracy.  It  can  be  seen  that  the  Imhof  technique  gives  excellent 
results. 


3.  NEW APPROXIMATIONS FOR $Q_k$

There is clearly a need for an approximation to the distribution of $Q_k$
for problems where the $c_j$, $a_j$ do not appear in existing tables; such would
be the case, for example, in the distribution theory of statistics of
the chi-square type referred to above. The Imhof method can almost be
regarded as exact, but it does not adapt easily to step-by-step increases in $t$
matching the moments does not even ensure that $P(0) = 0$,
i.e., that the distribution of $Q_k$ "starts" at zero. This weakness is
shared  by  other  approximations,  including  that  of  Jensen  and  Solomon. 
However,  approximation  (2)  above  does  automatically  start  at  zero.  These 
considerations  suggest  it  will  be  better  as  an  approximation,  at  least  in 
the  lower  tail. 

3.1  The three-moment chi-square fit

The distribution of $Q_k$ is to be fitted by $z = A w^r$, where $w$ has
the $\chi^2_p$ distribution and the constants $A$, $p$, $r$ will be found by matching
moments. Let $p/2 = \nu$, and let $C = \Gamma(\nu)$; the moments of $z$ are then

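$$\mu = E(z) = A\,2^{r}\,\Gamma(r+\nu)/C,$$

$$\mu'_2 = A^{2}\,4^{r}\,\Gamma(2r+\nu)/C,$$

$$\mu'_3 = A^{3}\,8^{r}\,\Gamma(3r+\nu)/C.$$

Define

$$R_2 = \mu'_2/\mu^{2} = C\,\Gamma(2r+\nu)/\{\Gamma(r+\nu)\}^{2}$$

and

$$R_3 = \mu'_3/\mu^{3} = C^{2}\,\Gamma(3r+\nu)/\{\Gamma(r+\nu)\}^{3}.$$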

Given $R_2$ and $R_3$, these equations can be solved for $r$ and $\nu$, and
then $A$ is obtained from the expression for $\mu$. Computer routines are
available to perform these operations and then to calculate probabilities
or significance points of $\chi^2$, even with non-integer degrees of freedom.

Significance points for $\chi^2$ with degrees of freedom differing by 0.2
are given in Pearson and Hartley (1972).
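
One way the solution for $r$, $\nu$ and $A$ might be carried out is sketched below. This is an illustrative routine, not the one used for the tables here. It assumes $Q_k = \sum_j c_j (X_j + a_j)^2$, whose cumulants are $\kappa_s = 2^{s-1}(s-1)!\sum_j c_j^s(1 + s a_j^2)$, and solves the two ratio equations by nested bisection: for each trial $r$ the $R_2$ equation is solved for $\nu$, and $r$ is then adjusted until the $R_3$ equation holds. The function names and the search interval for $r$ are assumptions made for the example.

```python
import math

def raw_moments(c, a=None):
    """First three raw moments of Q = sum_j c_j * (X_j + a_j)**2."""
    if a is None:
        a = [0.0] * len(c)
    # Cumulants: kappa_s = 2**(s-1) * (s-1)! * sum_j c_j**s * (1 + s * a_j**2).
    k1 = sum(cj * (1 + aj * aj) for cj, aj in zip(c, a))
    k2 = 2 * sum(cj**2 * (1 + 2 * aj * aj) for cj, aj in zip(c, a))
    k3 = 8 * sum(cj**3 * (1 + 3 * aj * aj) for cj, aj in zip(c, a))
    return k1, k2 + k1**2, k3 + 3 * k1 * k2 + k1**3

def _bisect(f, lo, hi, iters=200):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    flo = f(lo)
    if (f(hi) > 0) == (flo > 0):
        raise ValueError("no sign change on the search interval")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (flo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_R2(r, v):
    # log of C * Gamma(2r + v) / Gamma(r + v)**2, with C = Gamma(v).
    return math.lgamma(v) + math.lgamma(2 * r + v) - 2 * math.lgamma(r + v)

def log_R3(r, v):
    # log of C**2 * Gamma(3r + v) / Gamma(r + v)**3.
    return 2 * math.lgamma(v) + math.lgamma(3 * r + v) - 3 * math.lgamma(r + v)

def fit_three_moment(c, a=None, r_range=(0.05, 20.0)):
    """Return (A, p, r) such that A * w**r, w ~ chi-square(p), matches three moments."""
    m1, m2, m3 = raw_moments(c, a)
    t2, t3 = math.log(m2 / m1**2), math.log(m3 / m1**3)

    def nu_for(r):
        # log_R2 is decreasing in v for fixed r > 0, so the root is unique.
        return _bisect(lambda v: log_R2(r, v) - t2, 1e-8, 1e8)

    r = _bisect(lambda rr: log_R3(rr, nu_for(rr)) - t3, *r_range)
    v = nu_for(r)
    A = m1 * math.exp(math.lgamma(v) - math.lgamma(r + v) - r * math.log(2.0))
    return A, 2.0 * v, r

# Equal weights make Q an exact multiple of chi-square(4): expect A = 0.5, p = 4, r = 1.
A, p, r = fit_three_moment([0.5, 0.5, 0.5, 0.5])
```

Working with logarithms of the ratios keeps the gamma functions from overflowing. A Newton-type solver would be faster, but bisection needs no derivatives and fails loudly when the interval does not bracket a root.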

The $\chi^2$ approximation is not sensitive to employing for $p$ the
closest integer to $p$. If $R_2$ is then used to solve for $r$, a good
approximation will often result. Further details on this method of
approximation, with other areas of application, will be given in a later report.
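
Once $A$, $p$ and $r$ are known, a probability for the fitted distribution follows from $\Pr(A w^r \le t) = \Pr\{w \le (t/A)^{1/r}\}$. The sketch below evaluates this for non-integer $p$ using the power series for the regularized incomplete gamma function. It assumes $A > 0$ and $r > 0$; the function names are illustrative, and the series is adequate for moderate arguments only.

```python
import math

def reg_lower_gamma(s, x, max_terms=10000):
    """Regularized lower incomplete gamma P(s, x) by its power series."""
    if x <= 0.0:
        return 0.0
    # gamma(s, x) = x**s * exp(-x) * sum_{n>=0} x**n / (s * (s+1) * ... * (s+n)).
    term = 1.0 / s
    total = term
    for n in range(1, max_terms):
        term *= x / (s + n)
        total += term
        if term < 1e-17 * total:
            break
    return total * math.exp(s * math.log(x) - x - math.lgamma(s))

def chi2_cdf(w, p):
    """Pr(chi-square with p degrees of freedom <= w); p need not be an integer."""
    return reg_lower_gamma(0.5 * p, 0.5 * w)

def s_approx_cdf(t, A, p, r):
    """Approximate Pr(Q <= t) when Q is fitted by z = A * w**r, w ~ chi-square(p)."""
    if t <= 0.0:
        return 0.0  # the fitted distribution starts at zero
    return chi2_cdf((t / A) ** (1.0 / r), p)

# With A = 0.5, p = 4, r = 1 the fit is exact for four equal weights of 0.5,
# and Pr(Q <= t) = 1 - exp(-t) * (1 + t).
value = s_approx_cdf(1.0, 0.5, 4.0, 1.0)
```

Because the whole distribution function is available in this form, probability levels can be read off at any $t$, including points far into the lower tail.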


Table 2 compares significance points for several $Q_k$ distributions
for which exact (E) values exist in Solomon (1960) or Johnson and Kotz
(1968); the approximations are Imhof's (I), the Jensen-Solomon
approximation (J), and the new approximations, the four-moment Pearson curve fit
(P) and the new three-moment $\chi^2$ (S). In Table 3 these comparisons
are continued for higher values of $k$, for which no exact values exist.
In these tables, the values of $c_j$ have been chosen to sum either to 1
or to $k$, to enable comparisons to be made directly with values given
in the references. Jensen and Solomon compare values of $P_k(t)$ rather
than significance points, so in Table 4 we give some comparisons of this
type. In Tables 2, 3, and 4, values of $\beta_1$ and $\beta_2$ are given in
square brackets beneath the values of $c_j$, where $\beta_1 = \mu_3^2/\mu_2^3$ is a
measure of skewness and $\beta_2 = \mu_4/\mu_2^2$ is a measure of kurtosis.
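
For a given set of constants, the skewness and kurtosis measures can be computed directly from the cumulants. The short sketch below assumes $Q_k = \sum_j c_j (X_j + a_j)^2$ and the usual Pearson definitions $\beta_1 = \kappa_3^2/\kappa_2^3$ and $\beta_2 = 3 + \kappa_4/\kappa_2^2$; the function name is illustrative.

```python
def beta1_beta2(c, a=None):
    """Pearson beta_1 and beta_2 of Q = sum_j c_j * (X_j + a_j)**2."""
    if a is None:
        a = [0.0] * len(c)
    # Cumulants: kappa_s = 2**(s-1) * (s-1)! * sum_j c_j**s * (1 + s * a_j**2).
    k2 = 2 * sum(cj**2 * (1 + 2 * aj * aj) for cj, aj in zip(c, a))
    k3 = 8 * sum(cj**3 * (1 + 3 * aj * aj) for cj, aj in zip(c, a))
    k4 = 48 * sum(cj**4 * (1 + 4 * aj * aj) for cj, aj in zip(c, a))
    return k3**2 / k2**3, 3.0 + k4 / k2**2

# Four equal weights give a scaled chi-square(4): beta_1 = 8/4 and beta_2 = 3 + 12/4.
b1, b2 = beta1_beta2([0.25, 0.25, 0.25, 0.25])
```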


3.3  Comments

In comparing approximations with exact values, the important quantity
is $\alpha'$, the exact significance level realized by an approximate point
calculated for significance level $\alpha$. The values in Table 2 show
immediately the excellent accuracy of the Imhof technique, and values in Table 1
obtained by Imhof's method, and quoted again in Table 3, may for practical
purposes be regarded as accurate. Apart from this, all three
approximations (P, J, S) generally give excellent accuracy in the upper tail. In
the lower tail, the Pearson curves often are very poor, especially for
higher values of $\beta_1$ and $\beta_2$; Tables 2 and 3 show the clear supremacy
of S over P. Table 4 suggests that S is more accurate than J in
both tails; however, all approximations (except Imhof's) become relatively
less good in the lower tail as the skewness and kurtosis of $Q_k$ increase.

Jensen and Solomon have already demonstrated an overall supremacy of J
over other approximations discussed by them. From the picture presented
here, S is as good as or better than J and thus merits consideration
as an approximation if only three moments are to be used, and especially
for accuracy in the lower tail. Probability levels for S are easily
obtained all along the curve, and this is useful in combining significance
tests, one of the occasions when the lower tail becomes important.